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Abstract: Consider a set of terminal nodes K that belong to a network whose 
nodes are connected by links that fail independently with known probabilities. 
We introduce a method for estimating any performability measure that depends 
on the hop distance between terminal nodes. It generalises previously introduced 
Monte Carlo methods for estimation of the PC-reliability of networks with vari- 
ance reduction compared to crude Monte Carlo. They are based on using sets of 
edges named d-pathsets and d-cutsets for reducing the variance of the estimator. 
These sets of edges, considered as a priori known in previous literature, heaviliy 
affect the attained performance; we hereby introduce and compare a family of 
heuristics for their selection. Numerical examples are presented, showing the 
significant efficiency improvements that can be obtained by chaining the edge 
set selection heuristics to the proposed Monte Carlo sampling plan. 

Key-words: heuristics, pathset, cutset, Monte Carlo, rare events, reliability, 
performability, bounded length, diameter constraints 
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Une methode de simulation pour l'estimation de 
la performabilite des reseaux utilisant pathsets et 
cutsets calcules heuristiquement 

Resume : Considerez un ensemble de noeuds terminaux K appartenant a un 
reseau dont les nceuds sont relies par des liaisons qui echouent independamment 
avec des probabilites connues. Nous presentons une methode pour estimer 
n'importe quelle mesure de performabilite qui depend de la distance en sauts 
entre les nceuds terminaux. Elle generalise des methodes de Monte Carlo prece- 
demment introduites pour l'estimation de la K-fiabilite des reseaux avec reduction 
de la variance par rapport a Monte Carlo standard. Ces methodes sont basees 
sur l'utilisation d'ensembles d'aretes designes d-pathsets et d-cutsets pour reduire 
la variance de l'estimateur. Ces ensembles d'aretes, consideres comme connus 
a priori dans la litterature precedente, affectent fortement les performances 
atteintes ; nous introduisons et comparons une famille d'heuristiques pour leur 
selection. Des exemples numeriques sont presentes, montrant les importantes 
ameliorations dans l'efhcacite qui peuvent etre obtenues par le chainage de ces 
heuristiques avec le plan d'echantillonnage de Monte Carlo propose. 

Mots-cles : heuristiques, pathset, cutset, Monte Carlo, evenements rares, 
fiabilite, performabilite, longueur bornee, restrictions de diametre 
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1 Introduction 

Consider a communication network whose components randomly fail, modelled 
as an undirected graph. The most classical reliability analysis model assigns to 
the network two possible states determined by the link states; it is operational if 
and only if a certain set of distinguished sites, known as terminals, are connected, 
otherwise it is failed. A generalization introduced in |T2] imposes also an upper 
limit in the allowed distance between terminals for the network to be considered 
as operational. This generalization, conceived to model situations where limits 
exist in the acceptable delay times or the number of hops undergone by data 
packets, keeps the binary nature of the network state. But when in comes to 
performability, in several contexts there is need to employ metrics defined over 
a larger number of network states, characterised by the hop distance between 
terminals. For example, in voice-over-IP applications, the perceived quality is 
affected by latency, which is in turn determined by the number of links tra- 
versed by packets. Quality deteriorates as high hop-distance states occur more 
frequently in the network. In web applications with rich interfaces, the quality 
perceived by the end user is related to responsiveness, where latency determines 
the delay between the user actions and their effect on the output interface. The 
same consideration applies to other contexts where costs are born each time a 
link is traversed e.g. vehicle or packet routing, where costs can relate to time 
spent, tolls, fuel, etc. 

Classical reliability analysis consists of computing, estimating or bounding 
the probability that the network is operational, that is, the expected value of 
the binary variable associated with the network state. Surveys can be found 
in [7], [T3] and [I3j. In particular, many Monte Carlo methods have been pro- 
posed for efficient estimation of this expected value. This article introduces a 
Monte Carlo method to estimate any performability metric that is defined as 
the expected value of a distance-dependent network metric. In real communi- 
cation networks, link reliabilities are normally very high. Then, when applying 
simulation methods, sampling a 'high distance' network state is a rare event. 
In the context of Monte Carlo simulations for network reliability analysis, once 
fixed a certain confidence interval goal, the needed sample size unboundedly 
grows as link reliabilities become higher. Several variance-reduction techniques 
can be used to reduce the sample size; surveys can be found in [T3], [3] and 
[5]; more recent works include @], 0, [TT], [TB] and [2J. [TU] and [S] introduced 
a family of methods for the classical reliability, based on sampling strategies 
conditioned by paths and cuts. In [B] it is shown how to extend them to include 
diameter constraints, employing sets of edges named d-pathsets and d-cutsets. 
The Monte Carlo method hereby proposed generalises these methods in order 
to estimate the expected value of a random variable determined by the maximal 
distance between pairs of terminals. 

The d-pathsets and d-cutsets, that were considered as a priori known in the 
previously mentioned literature, heaviliy affect the performance attained by the 
simulation methods. We hereby introduce and compare a family of heuristics 
for their selection. We present numerical evidence of the significant efficiency 
gains attained by chaining these heuristics to the proposed simulation method 
when compared to a crude Monte Carlo simulation. 

The remainder of the article is organised as follows. Section [2J includes 
definitions, notation and model formalisation. Section [3J describes the crude 
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Monte Carlo method and the estimators it gives for the metric under study as 
well as for its variance. Section U] describes the suggested Monte Carlo method 
and shows the variance reductions achieved relative to the crude one. Section [3] 
describes the heuristics for selecting the (d-)pathsets and (d-)cutsets that will 
be applied in the simulation. Section [5] presents three numerical examples, 
based on mesh-like networks. It compares several variations of the heuristic, 
the relative efficiency of the proposed Monte Carlo method vs. the crude one 
and how it is influenced by link reliabilities. Finally, conclusions and further 
work are summarised in Section [7] 

2 Definitions and Notation 

The network is modelled by an undirected graph G = {V,E) with n = \V\, 
m = \E\ and E = {ei...e m }, whose nodes and edges correspond to the sites and 
links of the network respectively. The following definitions and notation are also 
employed: 

• K C V: a subset of nodes that corresponds to the distinguished sites 
(called terminal nodes or simply terminals); 

• X e : for every e&E, a random binary variable whose value is 1 if e operates 
and otherwise; 

• r e : reliability of e (the probability that it is operational at any given 
instant). The edges are assumed to fail independently of one another; 

• X = (Xi...X rn ) 6 {0, l} m : a network configuration (an m-tuple encoding 
the states of all edges); 

• X: set of the 2 m possible network configurations; 

• Tt(x) = Vy(X — x) (probability that the random network configuration is 

x); 

• A : X — > {0, n, oo}: the function that gives the maximum distance A.(x) 
between two terminals in the partial graph of G encoded by a network 
configuration x; 

• K is d-connected in G if and only if, for each pair of nodes of K, there is 
a path between them, whose length is not above the integer d; 

• $: network parameter to estimate (random variable determined by the 
network configurations) . 

Our goal is to estimate the expected value of the random variable $. The 
value of this random variable is determined by the network configurations as 
follows. The set {0, ...,n, oo} (the codomain of A) is partitioned into several 
intervals (think of them as 'quality levels'). Then each interval is mapped to one 
value from a set Q C R with | Q\ < r+1. Therefore every network configuration x 
belongs to an unique 'quality level' that corresponds to a certain $ value. Then, 
our final aim is to estimate the expected value of the function $ : X — > Q. 

Figure Q] illustrates the model and notations used. Given r integers < 
do < ■ ■ ■ < d r -i, let A = {Aq U Ai U • ■ • U A r } be a set of intervals where 
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Ao = (Q, G?oL Ai = for i G [l,r — 1] and A. r = (d r _i,oo]. Let X be 

partitioned (and its components called regions) as X = >^ q U • • • U X r where 
Xi = {x £ X | A(x) s Ai} for i e [0,r]. Let pi denote the probability that 
x is one of Xi. Let also Zi denote the probability that x is one of Zi, being 
Zi C X l defined in Section g] Similarly, Q = {$ , $1, . . . , $r} and <i> : A" ->■ Q 
gets defined by $(ar) = $j A(x) e Ai(x). 



X 


: ^0 i 


: ^1 j 




: z r _i ; 








pi 




Pr-l 




A 


(0,do] 


(d ,dt] 




{dr-2, 
dr-l] 


oo] 


regions 


Xq 


X x 




X r -i 




$ 


$ 











Figure 1 : Partitions of the network configuration space 



3 Crude Monte Carlo Method 

A crude Monte Carlo simulation estimates the expected value <£> of <3? by inde- 
pendently sampling N network configurations x^\ . . . ,x^ N \ computing $ for 
each one, and building an estimator 

i=l..N 

whose variance is (EV(-) denotes the expected value) 

i=l. .JV 

Sampling each x^ involves m Bernoulli trials for determining the state of 
each edge. Computing $ for a;W is done by applying a breadth-first-search 
(BFS) algorithm starting at every node of K. So each iteration takes a time 
that is 0(max{?7j, i"T| 2 }). 

4 Proposed Monte Carlo Method 

Suppose that certain topological knowledge about G = (V, E) is available in 
the form of certain edge sets, called pathsets, cutsets, di-pathsets and di-cutsets 
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for a given i G [0..r — 1], that we define next. Let P be any subset of P and 
G" = (V, P) the partial graph of G yielded by P. Then, 

• P is a pathset if and only if if is connected in G'; 

• P is a d-pathset if and only if if is d-connected in G'; 

• a (d-)pathset is said to operate when all of its edges operate. 

Similarly let C be any subset of E and G 1 = (V,E\ C) the partial graph of 
G yielded by E \ C. Then, 

• C is a cutset if and only if if is not connected in G"; 

• C is a d-cutset if and only if if is not d-connected in G'; 

• a ((i-)cutset is said to fail when all of its edges are failed. 

Hereafter, unless otherwise specified or clear by the context, the terms path- 
set and cutset refer to any of the above defined sets, regardless of the presence 
or absence of a length constraint. In our context, an elementary event is a net- 
work configuration X, with operational edges {e G E/X e = 1} and failed edges 
{e G E/X e = 0}. The sets of operating/failed edges define whether a given 
pathset /cutset operates/fails. 

Under certain circumstances, the simultaneous occurrence of operating/failed 
sets allows to know the value of $ for a given network configuration. For ex- 
ample, suppose that a certain network configuration X is such that a given 
5-pathset operates while a given 2-cutset fails. It follows that the maximal 
distance between the nodes of if must be any of {3,4,5} in the partial graph 
encoded by X. If the interval (2,5] belongs to A, then the region to which X 
belongs is known, and so is its $ value. The proposed method takes advantage 
of this property as we see next. Assume that the following sets of edges are 
known: 

• P : set of some pathsets; 

• Pi, ... , V r -i- r — 1 sets such that each P, is a set of some -pathsets; 

• P r = (for convenience of notation); 

• Co = (for convenience of notation); 

• C\, . . . , C r -\: r — 1 sets such that each d is a set of some dj-i -cutsets; 

• C r : set of some cutsets. 

In the previous definitions, the word 'some' means that every set can contain 
any number of elements ranging from zero to the maximum existing number of 
(di-)pathsets or (rfj)-cutsets. At least one of the sets must be non-empty for the 
method to be useful; if all sets are empty then the method coincides with crude 
Monte Carlo. Now, the following events can be defined over X: 
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Ti = (some element of Vi operates) 
}Cj = (some element of Cj fails) 
Zq = 7o 
Zh =% AfCh 

2/ p — rC'ir- 



(h = !,-■■ ,r-l) 



(» = 0,.-. ,r-l) 
(j = !,••■ ,r) 



With these definitions, given a network configuration x. it holds that (Zq — > 
(A(x) G (0, d ])) and that (Z; -)• (A(x) G for i G [1 . . . r]. In other 

words, each event Zh = o-.. r determines a precise $ value. The events Zq, . . . , Z r 
are subsets of Xq, . . . , X r respectively and thus pairwise disjoint too. This is also 
shown in Figure [T] where pi = Pr(x G Xi) and Zi = Pr(x G Zi) for i G [0 . . . r]. 

In what follows all summations have an implicit subscript i = 0, . . . , r. Sup- 
pose that the probabilities zq, . . . , z r are easy to compute. Then it is easy to 
compute cj) = J2®i z i> the part of i> for which Z accounts. The method de- 
scribed below is based on computing the remaining part of «3> (the one given 
by the events out of Z) by restricting the sampling space to X\Z. Let z = 
Pr(x G Z) = J2 z i- Our sampling plan estimates $ by sampling N network 
configurations within X\Z (with a probability distribution that 

respects the relative probabilities among the network configurations in X\Z), 

computing $ for each one, and building an estimator $n with variance as 
follows: 



Here the expected values involve probabilities conditioned to X\Z, hence 
the application of the correction factor 1/(1 — z) to pi — Zi. 

4.1 Variance Reduction 

The variances obtained through the crude and the proposed Monte Carlo meth- 
ods are next compared; for simplicity it is done for one single iteration (i.e. the 
simulation sample size is one). Single-index summations are over i G {0, . . . , r} 
and double-index summations over all pairs (i,j) G {0, . . . , r} 2 . 




whose variance is 



N — 





(EV($> 2 ) - EV 2 ($)) 
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£ ^Pi + (i - z) y $2 fe - z i) - 2 ^ + 4> 2 = 

i i 

z Y ^ - z Y *? * + Y 

i i i 

2 (E $ ^)(E ^ + E $ ^) 2 = 

2 i i 

Y ,| '- - X '''< + E ,|, "/'' : ./ 

ij ij ij 

2 E ij^jPiZj + Y ij^i^jZiZj = 

Y^ ~ ®jfPi Z j + Y^^ ~ ®D Z i Z j = 
ij ij 

Y($i-$jf(Pi-Zi>j + 

ij 

£ - $,) 2 + - $2) z . z . = 

ij 

»j 

Y^ - $ j - ) 2 (Pi - + E( $ j - $ <) 2 ^-- 

ij i<j 

The following Lemma characterises the variance reduction in terms of the 
regions and their respective subsets determined by the set of pathsets and cut- 
sets. 

Lemma 1 The difference of the variances a\ — af is always non-negative. Re- 
garding strict positivity, it is a necessary and sufficient condition that two non- 
empty regions Xi and Xj exist such that their $ values are different and Zj is 
non-empty. 

Proof. Observe that both summations in the final form of expression of a\ — a\ 
only involve non-negative terms, hence the difference of variances is always non- 
negative. Regarding strict positivity assume that two non-empty regions Xi 
and Xj exist such that $i ^ <E>j , pi > and Zj > (therefore pj > 0) . Then, 
if pi > Zi, the first summation will have a strictly positive term given by the 
subindices i,j. If pi = Zi, then > and therefore the second summation will 
have a strictly positive term given by the subindices This proves the suffi- 
ciency of the statement about Xi and Xj. Conversely, assume that o\ — o\ > 0. 
Then there must exist i, j such that at least one of the corresponding terms 
in the first and second summation is strictly positive. If the term for the first 
summation is strictly positive, then $i > <f>j, pi > Zi and Zj > 0. Since 
(pi > Zi — > pi > 0) and (zj > — > pj > 0) the statement about Xi and Xj holds 
true. If the term for the second summation is strictly positive, then $i > $j, 
Zi > and Zj > 0. Again, Pi > and pj > and the statement holds true. 4 
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4.2 Sampling within X \ Z 

Let SI = ljj =0 r (ViUCi) be the set of all edges occurring in at least one pathset 
or one cutset under consideration. Note that the event Z is independent from the 
state of all edges that do not belong to SI. Then the state of the edges of E\£l can 
be easily sampled with m — |S1| Bernoulli trials. A general sequential procedure 
to sample the state of the edges of SI is the following. Let (oi . . . o\q_\) € {0, l}l n l 
an |Sl|-tuple encoding the sampled states. Assume that o\...o r have already 
been sampled (being r < |S1|) and let E r be the event in which the first r edges of 
SI have these sampled states respectively. Then the probability that the r + 1-th 
edge of SI is operational is (knowing that the edges of SI must have states such 
that the network configuration belongs to X \ Z): 

Pl(0r+I = \ \ X \ Z A E r ) = P r( . + , - l) g^f^g = " 

where Pr(o r+ i = 1) is the reliability of the r + 1-th edge of SI. The function 
that given the reliabilities p\, . . . , of the edges of SI returns the probability 
of the event X\Z, is a polynomial P(/?i, . . . , p\n\)- So, computing Yr(X\Z \ E r ) 
involves replacing pi, . . . , p r by or 1 according to the states sampled for the 
first r edges and then evaluating P. Finding (and evaluating) P can be very com- 
plex when pathsets and cutsets of the same region highly overlap. Observe that 
Pr(X\Z) = l-Pr(Z) = l-Pr(A i -Zi) = l-£i Pr (7» A ^Q) (recall the pairwise 
independence of all Zi). To compute each Pr(7I A ICi), if every edge involved 
occurs only in one of Vi and Ci then it is possible to get factorised expressions 
(see the Appendix in and then building and evaluating the polynomial can 
be done in time 0(|S1|). The previous considerations about the overlapping of 
edge sets within the same region also apply to computing zq, ■ ■ ■ ,z r . 

For limited cardinalities of SI an alternative approach can be used, consisting 
in precomputing the probability of the occurrence of each of the 2^ possible 
sub-configurations that exist when only considering the edges of S7, in 0(2' n ') 
time. Then, sampling their states just involves choosing a sub-configuration at 
random through a cut-point access on a table accumulating the precomputcd 
probabilities (thus in 0(|S1|) time). This is the fastest way to sample the states 
for the edges of SI, but at the expense of the exponential-in-|Sl| effort for pre- 
computing the table, that can limit its applicability on large networks. 

5 A heuristic for pathset and cutset generation 

In this section we introduce a heuristic that generates pathsets and cutsets for 
every region Xi. We develop an algorithm for the two-terminal problem that 
can be easily generalised to the problem for general sets K. For every i, we 
build a set of d^-pathsets and di_i-cutsets, such that no two elements share 
edges. This set should ideally be the one that maximises the probability Zi that 
one rf-pathset operates and one e?-cutset fails at the same time, to attain the 
largest variance reduction that is possible. The basic idea consists of a greedy 
randomised generation of paths with lengths in dj\, followed by a greedy 

randomised generation of di_i-cutsets, without using any edge included in the 
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former paths. The cycle is repeated several times and the combination of sets 
that yield the higher probability is finally chosen. 

5.1 Paths generation 

The algorithm shown in Fig. [5] receives the graph G, source and destination 
nodes (s,t) and the minimum and maximum distances £±,£2 that define the 
region. It returns a random path whose length is in the interval [£1,^2] or 
the null (_L) element if no such path could be found. The algorithm proceeds 
with a greedy selection of nodes (currNode) that are added to the path under 
construction newPath. It starts by selecting s and ends after reaching t or 
achieving a point where it is impossible to complete a path with the required 
constraints. In each iteration, the shortest path (shP) between the current 
node and t is computed. If there is no such path, or its length exceeds the 
difference between £2 and the length of the already built part, the algorithm 
returns the null element _!_. Otherwise, the next edge can be either the first 
one of shP, or another one different from it. When there are more than one 
feasible edges, the first one of shP is chosen with a probability proportional to 
the ratio between the length of shP and the maximum number of edges that can 
be added to the path under construction without violating the [^1,^2] constraint. 
The function rand ( ) in the algorithm returns a random uniform real number in 
[0,1). Therefore the algorithm tends to stick to a shortest path when newPath 
has reached a length that leaves few chances to divert. On the contrary, when 
there is still room for many more edges than the length of a shortest path, the 
algorithm will tend to choose, at random, other directions to extend newPath. In 
line 11, care is taken to not repeat a node of shP, which would result in having 
cycles within the newPath. Lines 13-16 remove the chosen edge from G for 
future iterations, append it to newPath and update its length £ and currNode. 
Random is therefore introduced in the decision of wheter to make a step in the 
direction of a shortest path or not, as well as in the selection of the next node, 
in case it was decided not to follow the shortest path. 

5.2 Cutsets generation 

The algorithm for creating an ^-cutset given a certain integer £ is shown in 
Fig. [3] It receives the graph G, the source and terminal nodes (s,t), the integer 
I and a set of edges H. It starts by building a first-in-first-out queue with 
all edges of G \ H inserted in increasing order of distance to s. Random is 
introduced by shaking these distances prior to insertion. For example, in our 
tests we swaped each pair of values in d with a probability inversely proportional 
to their difference. The queue is later used to add each edge to the cutset under 
construction newCut in this shaken order; the idea behind this is that dropping 
edges in the vecinity of s is a good strategy to find low cardinality ^-cutsets 
for s, t. The set H will be used when invoking this algorithm to avoid using 
edges already used for other ^-pathsets or ^-cutsets found prior to the one under 
construction. The while loop proceeds adding edges to newCut and dropping 
them from G until one of the following occur: i) the distance between s and t is 
greater than £ (so we have an ^-cutset); or ii) the queue is empty (so no ^-cutset 
exists, which happens if and only if there was a path whose length is not above 
I built exclusively with edges of G\ H). After the while loop, newCut is an 



RT 11° 8267 



Performability Estimation with Edge-sets Heuristics 



11 



Procedure generatePath(G, s, t, £\,£-i) 

1: I 0; currNode <— s; newPath 
2: while currNode 7^ t do 

3: shP <— shortestPath(G, currNode, t, newPath) 

4: dist <— lcngth(shP) 

5: if (shP = _L) V (£ + dist > £ 2 ) then 

6: return _L 

7: else 

8: if (currNode has only one neighbour in G) V (rand() < (£ + dist — 

4)/(^2 - tx)) then ' 
9: next <— neighbour of currNode in shP 

10: else 

11: next <— any neighbour of currNode in G \ shp 

12: end if 

13: remove the edge (currNode, next) from G 

14: append the edge (currNode, next) to newPath 

15: currNode <— next 

16: £ <r- £ + 1 

17: end if 

18: end while 

19: return newPath 

Figure 2: Heuristic algorithm for generating paths with lengths in [€1,^2] 

^-cutset not necessarily minimal (i.e. some edges can be dropped from it and 
still be an £-cutset). The for loop builds an f-cutset minCut that is minimal in 
this sense (although, in general, it will not be a minimum-cardinality ^-cutset). 

5.3 The main heuristic 

Our main algorithm iterates until a given amount of time is spent. In each 
iteration a disjoint set of edges for a certain region Xi is generated. Each 
iteration will begin with the generation of one di-pathset and one c£i_i-cutset. 
It then will continue adding more sets, following a certain sequence that defines 
a "version" of the algorithm. Each iteration may not exceed a certain parameter 
time MAX_TIME; if it does then it is discarded and a new iteration is run. For 
each generated sets P and C of <ii-pathsets and di_i-cutsets, the probability 
of the event Zi that they define is computed; the P,C pair with the highest 
Pr(zi) is recorded as the algorithm proceeds and returned after timing out. The 
algorithm shown in Fig.|¥]corresponds to the version where the sequence pathset- 
cutset-pathset-cutset is followed in each iteration; we will denote it as PCPC. It 
illustrates how to invoke the procedures generatePath and generateCutset. 
In Section 16.21 we compare the results obtained with seven different versions. 
The algorithm receives the graph G, source and destination nodes (s, t) and 
the range of distances allowed for the zone \t\ , £2} ■ It returns the pair (P, C) 
whose probability was the highest among all the pairs generated. The pair 
is not necessarily built by two pathsets and two cutsets. Note the way that 
generatePath and generateCutset are invoked in lines 13 and 19. Passing 
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Procedure generateCutset(G, s, t, £, H) 

1: d 4- vector of distances between s and every node in G \ H 

2: randomAlter(rf) 

3: queue 4- all edges of G\H inserted in increasing order of their values in d 

4: newCut 4— 0; flag 4— true 

5: while flag do 

6: shP 4- shortestPath(G, s, t, 0) 

7: if (shP = _L) V (length(shP) > i) then 

8: flag 4— false 

9: else if isEmpty(queue) then 

10: return _L 

11: else 

12: newEdge pop(queue) 

13: remove newEdge from G 

14: add newEdge to newCut 

15: end if 

16: end while 

17: minCut 4- 

18: for all e e newCut do 

19: add e to G 

20: shP 4- shortestPath(G, s, t, 0) 

21: if (shP ^ _L) A (length(shP) < I) then 

22: remove e from G 

23: add e to minCut 

24: end if 

25: end for 

26: return minCut 

Figure 3: Heuristic algorithm for generating an ^-cutset 
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Procedure PCPC(G, s, t, £ u £ 2 ) 

1: bestP <- _L; bestC 4- ±; highestProb <- 

2: while ellapsed_time < MAX _ TIME do 

3: P <- 0; C <- 

4: repeat 

5: p «— generatePath(G, s, t, Zi, Z2) 

6: until (p ^ ±) or MAX _ TRIES attempts were done 

7: if (p = _L) then continue //aborts current "while" iteration 

8: P^PU{p} 

9: repeat 

10: c <— generateCutset(G, s, t, l\ — 1,P) 

11: until (c ^ _L) or MAX _ TRIES attempts were done 

12: if (c = _L) then continue // aborts current "while" iteration 

13: C^CU {c} 

14: if Pr(P,C) > highestProb then 

15: highestProb <- Pr(P,C); bestP <- P; bestC <- C 

16: end if 

17: repeat 

18: p <— generatePath(G \ P, s, t, l\, l 2 ) 

19: until (p ^ ±) or MAX _ TRIES attempts were done 

20: if (p = _L) then continue //aborts current "while" iteration 

21: P^PU {p} 

22: if Pr(P,C) > highestProb then 

23: highestProb <r- Pr(P,C); bestP P; bestC <- C 

24: end if 

25: repeat 

26: c generateCutset(G, s, i, h - 1, P U C) 

27: until (c ^ _L) or MAX _ TRIES attempts were done 

28: if (c = _L) then continue // aborts current "while" iteration 

29: C^CU {c} 

30: if Pr(P,C) > highestProb then 

31: highestProb <- Pr(P,C); bestP <- P; bestC <r- C 

32: end if 

33: end while 

34: return bestP, bestC, highestProb 

Figure 4: Pseudo-code for the main heuristic; version PCPC 

G \ P and P U C allows respectively to obtain disjoint pathsets and cutsets 
respect to those so far generated in the current iteration. If a subprocedure is 
invoked MAX_TIMES times without succeeding to return a pathset or cutset, 
a new iteration is started. Similar algorithms PP, PPP, ... and CC, CCC, ... are 
used for the border regions Xq and X r , generating only pathsets or only cutsets 
(with no length constraint). 
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A 


region 


r e = 0.9 


r e = 0.95 


r e = 0.99 


up to 5 


Xo 











6 to 7 


X x 


5 


30 


1,000 


above 7 


x 2 


10 


60 


2,000 


disconnected 


X z 


20 


120 


4,000 



Table 1: Test 1: fines per region. 

6 Numerical examples 

This section provides numerical examples based on mesh-like topologies. The 
simulations are inspired on the following situation. There is a contract between 
a communication network provider and a customer who needs to periodically 
exchange data between two sites s and t. They agree on a scale of fines, to 
be paid by the provider, according to the number of hops that each packet 
undergoes. The aim is to estimate the expected value of the fines that will be 
paid during the contract lifetime. To do so, simulations based on crude Monte 
Carlo and the proposed method were run and their results compared. The 
methods were implemented in C++ and the tests were run on an Intel Core2 
Duo T5450 machine with 2 GB of RAM, executing 10 7 iterations (sample size). 

6.1 Test case 1 - ANTEL's transport network 

Test case 1 is based on the countrywide transport network topology of ANTEL, 
the largest telecommunications provider in Uruguay, shown in Figure [5] Nodes 
represent the sites whose interfaces perform routing activity, thus adding sig- 
nificant latency. Links represent the existing paths between these sites. Three 
scenarios are considered, corresponding to interface failure probability values of 
0.10, 0.05 and 0.01 respectively. The test illustrates the effect that the rarity of 
edge failures has on the attained efficiency relative to crude Monte Carlo. As- 
sume that nodes 4 and 14 (shown as squares) are to be connected in a context 
where low latencies are desirable. Table Q] lists the scale of fines payed accord- 
ing to the hop distance between both nodes. Three scales are used, each one 
corresponding to a certain value of link reliability in the network model (0.90, 
0.95 and 0.99). The simulations will estimate the expected value of the average 
fine as well as its variance. The fine scales were proportionaly adjusted so that 
the expected values of the fines to pay were rather similar, by running short 
simulations. Note how the fines per region must quickly increase to yield the 
same expected value of fines when link reliabilities become higher, due to the 
fact that network configurations with high distances, or disconnected, become 
rarer events. In other words, this means that the provider can agree on pay- 
ing higher fines per region still facing the same fine expected value, because of 
improved link reliabilities. Table [5] shows the pathsets and cutsets employed, 
using the edge labels of Figure [S] 

Table [3] shows $ and $ (the expected values of $ estimated by the crude 
and proposed methods respectively); the estimations obtained for a 2 and a 2 ; 
and the total times (in seconds). As above mentioned the scale of fines was 
set up so that the expected values of the fines to pay were approximately the 
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Figure 5: Test 2: reduced transport network topology of ANTEL, Uruguay's 
national telecommunications provider 



reg. 






Xo 


Vo = {{4, 5, 9,1}, {11, 12, 2, 8}} 


Co =0 


Xi 


Vi = {{11,12,2,13,14,15}} 


Ci = {{1,8}} 


x 2 


P 2 = {{4, 17, 10, 18, 19, 
20,21,22, 15}} 


C 2 = {{1,8,13}} 


x 3 




C 3 = {{3, 4, 11}, {1,8, 15}} 



Table 2: Test 1: pathsets and cutsets. 



same (they ranged from 0.341855 to 0.361926). Times spent are essentially the 
same across the three reliability scenarios, with the crude method taking a time 
approximately 11% lower than the proposed method. Observe the significant 
reductions achieved in the variance by the proposed method (13.63, 45.27 and 
940.2 for r e equal to 0.90, 0.95 and 0.99 respectively). An efficiency compar- 
ison can be reported via the relative efficiency RE, which is a standard ratio 
employed in simulation literature (see e.g. [5], [H]), defined as follows: 

var crude time cru( i e O time C rude 
RE — x — « — x — 

Var proposed i^^^proposed fj^ time proposed 
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Crude 


Proposed 


ratio 


r e = 0.9 




$ 


0.341855 


0.341856 




a 2 


4.538235x10"'' 


3.328820x10"* 


13.63 


t( B ) 


290.9 


323.7 


0.8987 


r e = 0.95 




$ 


0.361926 


0.361632 




a y 


2.467931 xl0" b 


5.451282x10"* 


45.27 


t(s) 


285.8 


321.3 


0.8896 


r e = 0.99 




$ 


0.338000 


0.334667 




a a 


6.018858X10" 5 


6.401462 xl0" s 


940.2 


t(s) 


280.4 


321.7 


0.8715 



Table 3: Test 1: numerical results. 



Instance 


Size 


s 


t 


li 


e 2 


Pr(*i) 


Gridl 


8x8 


(2,2) 


(4,4) 


6 


10 


1.895E-4 


Grid2 


8x8 


(2,2) 


(5,5) 


8 


12 


1.974E-4 


Grid3 


15 x 15 


(4,7) 


(11,7) 


10 


20 


2.529E-6 


Grid4 


15 x 15 


(4,4) 


(11,11) 


18 


24 


9.575E-9 



Table 4: Test 2: grid-based instances for heuristic tests. 



The RE expresses the attained variance reduction adjusted by the spent- 
time ratio; in this case, its values are 12.25, 40.27 and 819.4 respectively for 
r e = 0.90, r e = 0.95 and r e = 0.99. 

6.2 Test 2 - Square grids 

This test illustrates the behaviour of seven versions of the heuristic algorithm. 
One version (that we call PC) always returns one pathset and one cutset. Two 
versions (PCP, PCC) can return one extra pathset or cutset respectively. Fi- 
nally, four versions (PCPP, PCPC, PCCP, PCCC) return two, three or four 
components whose nature corresponds to each letter. Two network topologies 
were employed: square grids with 8x8 and 15 x 15 nodes respectively. Table [4] 
shows the characteristics of the four instances of the problem that were run 
for each version of the algorithm. Nodes s and t are specified by their u x,y 
coordinates" in the grid (numbered from zero). The reliability of each edge 
was randomically set, according to a triangular distribution (0.985, 0.99, 0.995). 
The parameters MAX_TIME and MAX _ TRIES were set to 40 seconds and 5 
tries. Last column of the table shows the highest probability found among those 
returned by each version of the algorithm. 

Table [5] shows the results of the four tests; within each one, the algorithms 
are sorted by descending order of the probability that each one returned. Col- 
umn labelled %best reports the ratio between the returned probability and the 
highest one for that particular test. Column hedges reports the total number 
of edges involved in the pathsets and cutsets of the returned solution. Columns 
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%best 


#edges 


2 sets 


3 sets 


4 sets 


best (P,C) 


Gridl 




PCCP 


1.000 


22 


1,055 


726 


687 


2,2 


PCPC 


0.998 


24 


671 


637 


255 


2,2 


PCCC 


0.911 


18 


497 


336 


71 


1,3 


PCC 


0.911 


14 


1,086 


729 





1,2 


PCP 


0.509 


18 


3,186 


3,052 





2,1 


PCPP 


0.501 


18 


2,635 


2,518 


943 


2,1 


PC 


0.465 


10 


4,135 








1,1 


Grid2 




PCPC 


1.000 


28 


417 


401 


152 


2,2 


PCCP 


1.000 


28 


604 


392 


377 


2,2 


PCCC 


0.898 


20 


463 


299 


266 


1,3 


PCC 


0.898 


16 


605 


387 





1,2 


PCP 


0.502 


22 


2,012 


1,907 





2,1 


PCPP 


0.501 


24 


1,709 


1,623 


621 


2,1 


PC 


0.458 


12 


2,439 








1,1 


Grid3 




PCCC 


1.000 


26 


82 


55 


43 


1,3 


PCPC 


0.772 


40 


79 


61 


30 


2,2 


PCCP 


0.769 


42 


105 


73 


54 


2,2 


PCC 


0.667 


23 


110 


74 





1,2 


PCPP 


0.394 


56 


286 


196 


59 


3,1 


PCP 


0.386 


37 


310 


204 





2,1 


PC 


0.340 


18 


380 








1,1 


Grid4 




PCP 


1.000 


50 


169 


105 





2,1 


PCPP 


0.996 


52 


180 


98 


64 


2,1 


PCPC 


0.996 


52 


56 


27 


13 


2,1 


PCCP 


0.988 


62 


31 


21 


12 


2,2 


PC 


0.821 


28 


202 








1,1 


PCC 


0.821 


28 


37 


17 





1,1 


PCCC 


0.821 


28 


28 


17 


12 


1,1 



Table 5: Test 2: ranking of results for the grid-based instances. 

2s, 3s and 4s report the number of solutions generated that had two, three 
and four components (pathsets plus cutsets). Finally column best(P,C) reports 
the number of pathsets and cutsets in the solution returned by each algorithm. 
First, note that in all cases there is a significant difference between the best and 
worst returned solutions (their ratio ranging from 1,22 to 2,94). Second, three 
of the best solutions had four components and the remaining one had three. 
These results suggest that it might be worth to spend time looking for solutions 
with many components, rather than striving to get the best "one pathset - one 
cutset" possible solution, in topologies alike. Third, there is no clear winner, 
although PCCC, PCPC and PCCP seem to have the best overall results. 
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A 


region 


r e = 0.90 


r e = 0.95 


r e = 0.99 


up to 5 


Xq 











6 to 12 


X X 


10 


150 


85,000 


above 12 


x 2 


1,000 


15,000 


8,500,000 



Table 6: Test 3: fines per region. 





Crude 


Proposed 


ratio 


r e = 0.90 




$ 


0.928540 


0.928373 




a 2 


9.135802xl0~~ b 


2.591886xl0~ b 


35.25 


t(s) 


606.4 


975.2 


0.6218 


r e = 0.95 




$ 


0.933135 


0.915608 




a 2 


1.366002 xl0" a 


6.188060xl0 _b 


220.75 


t(s) 


586.2 


950.4 


0.6168 


r e = 0.99 




$ 


0.935000 


0.928744 




a 2 


7.232224X10" 1 


3.217148xl0 _b 


22,480.23 


t(s) 


513.1 


914.4 


0.5612 



Table 7: Test 3: numerical results. 

6.3 Test 3 - Randomised extension of Arpanet 

This test illustrates the combined effect of the heuristic algorithm for path- 
set and cutset generation and the variance-reduction technique. The network, 
shown in Fig [5] has 60 nodes and 110 edges. It was generated by growing the 
original Arpanet with 40 nodes according to the random network model of [1J. 
As in Test 1, three instances were run, each one with all edges set to the same 
reliability. In this case the nodes s,t are represented as squares. Table [5] lists 
the scale of fines, split in three zones; again they were adjusted so that the ex- 
pected value was similar for the different edge reliabilities. The heuristic PCCP 
was applied for X± and it returned a solution built by one 12-pathset and two 
5-cutsets. Heuristics PPPP and CCCC where applied to define the zones Xq 
and X2, returning respectively three pathsets and two cutsets. The parameter 
MAX _ TIME was set again to 40 seconds. The results of the test are sum- 
marised in Table [7] each time reported for the proposed method include the 120 
seconds spent generating the pathsets and cutsets. Note the significant relative 
efficiencies, in particular for rarer failures, obtained in this test: 21.92, 136.15 
and 12,615.39 respectively for r e equal to 0.90, 0.95 and 0.99. 

7 Conclusions and future work 

The proposed simulation method showed its capability to achieve significant 
variance reductions when applied on mesh-like networks. The precise conditions 
under which there is a reduction in the variance of the estimated parameter, 
with respect to the crude method, were shown in Lemma 14.11 The tests also 
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Figure 6: Network for Test 3 - Random extension of Arpanet 



showed that the proposed heuristic was able to generate sets that exploited the 
mentioned variance-reduction potential at significant levels. Even considering 
the extra time required for running the heuristic and for sampling with the 
proposed plan, the efficiency gains are noteworthy, specially when the links 
become more reliable. 

The heuristics hereby introduced, yet resulting in important efficiency gains 
when chained to the simulation in these simple versions, can be improved in 
several ways. The algorithm could adapt the amount of effort spent in searching 
higher cardinality sets of pathsets/cutsets according to statistics on the sets so 
far found or on the connectivity level of the terminal set. It could also alter the 
relative effort devoted to the generation of different sequences of pathsets and 
cutsets, reacting to the results so far obtained during execution. The reliability 
of each edge should be taken into account when choosing which one to add 
to the pathset or cutset under construction, particularly in networks where the 
reliabilities significantly differ. The time spent generating the sets MAX_TIME 
could also be initially set according to the sample size and the number of edges 
and regions (all of which determine at a large extent the time spent by the 
simulation). Moreover, it could be adjusted during the algorithm execution in 
light of the number and quality of the so far found sets. 
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